Evaluation of influence of spectral and prosodic features on GMM classification of Czech and Slovak emotional speech
نویسندگان
چکیده
This article analyzes and compares influence of different types of spectral and prosodic features for Czech and Slovak emotional speech classification based on Gaussian mixture models (GMM). Influence of initial setting of parameters (number of mixture components and used number of iterations) for GMM training process was analyzed, too. Subsequently, analysis was performed to find how correctness of emotion classification depends on the number and the order of the parameters in the input feature vector and on the computation complexity. Another test was carried out to verify the functionality of the proposed two-level architecture comprising the gender recognizer and of the emotional speech classifier. Next tests were realized to find dependence of some negative aspect (processing of the input speech signal with too short time duration, the gender of a speaker incorrectly determined, etc.) on the stability of the results generated during the GMM classification process. Evaluations and tests were realized with the speech material in the form of sentences of male and female speakers expressing four emotional states (joy, sadness, anger, and a neutral state) in Czech and Slovak languages. In addition, a comparative experiment using the speech data corpus in other language (German) was performed. The mean classification error rate of the whole classifier structure achieves about 21% for all four emotions and both genders, and the best obtained error rate was 3.5% for the sadness style of the female gender. These values are acceptable in this first stage of development of the GMM classifier. On the other hand, the test showed the principal importance of correct classification of the speaker gender in the first level, which has heavy influence on the resulting recognition score of the emotion classification. This GMM classifier should be used for evaluation of the synthetic speech quality after applied voice conversion and emotional speech style transformation.
منابع مشابه
Determination of Formant Features in Czech and Slovak for GMM Emotional Speech Classifier
The paper is aimed at determination of formant features (FF) which describe vocal tract characteristics. It comprises analysis of the first three formant positions together with their bandwidths and the formant tilts. Subsequently, the statistical evaluation and comparison of the FF was performed. This experiment was realized with the speech material in the form of sentences of male and female ...
متن کاملClassification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
متن کاملA Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation
Abstract Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making a natural man-machine interaction. Since speech is the most popular method of communication, recognizing human emotions from speech signal becomes a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...
متن کاملStatistical Analysis of Spectral Properties and Prosodic Parameters of Emotional Speech
95 Statistical Analysis of Spectral Properties and Prosodic Parameters of Emotional Speech J. Přibil 1, 2 and A. Přibilová 3 1 Institute of Photonics and Electronics, Academy of Sciences CR, v.v.i., Chaberská 57, CZ-182 51 Prague 8, Czech Republic 2 Institute of Measurement Science, SAS, Dúbravská cesta 9, SK-841 04 Bratislava, Slovakia, [email protected] 3 Slovak University of Technology, Facu...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- EURASIP J. Audio, Speech and Music Processing
دوره 2013 شماره
صفحات -
تاریخ انتشار 2013